[ana6, com8]: Add ana6Optimisation Module, apply changes in com8MoTPSA #1245

Open
RolandFischbacher wants to merge 22 commits into master from RF_com8MoTPSA

Conversation

@RolandFischbacher
Contributor

This PR introduces a new optimisation module ana6Optimisation for com8MoTPSA and updates the simulation workflow.


The module ana6Optimisation includes:

  • Calculation of a combined loss function (runout + Tversky)
  • Morris sensitivity analysis for parameter ranking
  • Sequential and non-sequential surrogate-based optimisation routines

New files in ana6Optimisation:

  • runMorrisSA.py (configuration: runMorrisSACfg.ini)
  • runPlotMorrisConvergence.py (uses runMorrisSACfg.ini)
  • runOptimisation.py (configuration: runOptimisationCfg.ini)
  • optimisationUtils.py
  • README_ana6.md (contains usage instructions)

New file in out3Plot:

  • outAna6Plots.py

Changed workflow for running com8MoTPSA:

  • Check beforehand whether a simulation has already been run
  • Process simulations in chunks rather than all at once
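The two workflow changes above can be illustrated with a minimal sketch. It assumes simulations are identified by name and that the names of finished simulations are known; `pending_simulations` and `chunked` are hypothetical helpers for illustration, not functions from com8MoTPSA.

```python
def pending_simulations(sim_names, finished_names):
    """Return the simulations that still need to be run, keeping their order."""
    finished = set(finished_names)
    return [name for name in sim_names if name not in finished]


def chunked(items, chunk_size):
    """Yield consecutive slices of items with at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


if __name__ == "__main__":
    todo = pending_simulations(["sim0", "sim1", "sim2", "sim3", "sim4"], ["sim1"])
    for chunk in chunked(todo, 3):
        # each chunk could be handed to a worker pool before the next one starts
        print(chunk)
```

Processing one chunk at a time bounds how many simulations are in flight at once, and filtering first avoids re-running results that already exist.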

@RolandFischbacher RolandFischbacher self-assigned this Feb 23, 2026
@RolandFischbacher RolandFischbacher added the enhancement New feature or request label Feb 23, 2026
@qltysh
Contributor

qltysh bot commented Feb 23, 2026

❌ 1 blocking issue (1 total)

| Tool | Category | Rule | Count |
|---|---|---|---|
| black | Style | Incorrect formatting, autoformat by running qlty fmt. | 1 |

@qltysh one-click actions:

  • Auto-fix formatting (qlty fmt && git push)

@fso42 fso42 added this to the Version 2.0 milestone Feb 23, 2026
@fso42 fso42 changed the title from "[ana6Opitmisation], [com8MoTPSA]: Add ana6Optimisation Module, apply changes in com8MoTPSA" to "[ana6, com8]: Add ana6Optimisation Module, apply changes in com8MoTPSA" Feb 24, 2026
@fso42 fso42 assigned fso42 and unassigned RolandFischbacher Feb 24, 2026
Squash of 20 commits from RF_com8MoTPSA branch including:
- com8MoTPSA workflow improvements (chunked multiprocessing, path handling)
- Bayesian optimisation integration (ana6Optimisation module)
- Morris sensitivity analysis scripts
- AIMEC runout reference implementation
- probAna pickle saving and bounds
- Plotting and config improvements
@qltysh
Contributor

qltysh bot commented Feb 24, 2026

Qlty

Coverage Impact

⬇️ Merging this pull request will decrease total coverage on master by 0.19%.

Modified Components (1)

| Rating | Component | % Diff |
|---|---|---|
| Coverage rating: C | com1DFA | |

Modified Files with Diff Coverage (2)

| Rating | File | % Diff | Uncovered Line #s |
|---|---|---|---|
| Coverage rating: F | avaframe/com8MoTPSA/com8MoTPSA.py | 0.0% | 21-182 |
| Coverage rating: A | avaframe/in3Utils/cfgUtils.py | 71.4% | 995, 1016 |
| Total | | 10.0% | |

@fso42 fso42 assigned RolandFischbacher and unassigned fso42 Feb 24, 2026
- Add bounds to paramValuesD in createSamplesWithVariation (StandardParameters)
- Add writing of visualisation scenario and sampling method to com8MoTPSACfg.ini
Comment on lines +87 to +95
if 'VISUALISATION' in config.sections():
    # config is inifile
    index = config['VISUALISATION']['scenario']

if 'VISUALISATION' in config.sections():
    # config is inifile
    index = config['VISUALISATION']['scenario']
    if 'sampleMethod' in config['VISUALISATION']:
        sampleMethod = config['VISUALISATION']['sampleMethod']
Contributor

If any .ini file in the config directory lacks a VISUALISATION section, the variables index and sampleMethod are never assigned, but they are unconditionally appended on lines 107-108. If this happens on the first file, it raises UnboundLocalError. On later files, the values from a previous file are silently reused, so the data would be wrong without any error.

Contributor Author

Thank you for the info. I will initialize index and sampleMethod with np.nan to avoid UnboundLocalError and stale values.
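A minimal sketch of the fix described above, resetting both variables at the start of every loop iteration. It uses `math.nan` from the standard library instead of `np.nan` to stay dependency-free, and `read_visualisation_info` is a hypothetical helper that parses ini text directly; the actual loop in the PR reads files from the config directory.

```python
import configparser
import math


def read_visualisation_info(config_texts):
    """Return one (index, sampleMethod) pair per ini text, NaN where missing."""
    rows = []
    for text in config_texts:
        config = configparser.ConfigParser()
        config.optionxform = str  # keep key case, e.g. sampleMethod
        config.read_string(text)

        # reset on every iteration so nothing leaks over from the previous file
        index = math.nan
        sample_method = math.nan
        if "VISUALISATION" in config.sections():
            section = config["VISUALISATION"]
            if "scenario" in section:
                index = section["scenario"]
            if "sampleMethod" in section:
                sample_method = section["sampleMethod"]
        rows.append((index, sample_method))
    return rows
```

Because the defaults are assigned inside the loop, a file without a VISUALISATION section yields NaN for both fields rather than an exception or a value carried over from an earlier file.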

if modName.lower() in ["com1dfa", "com5snowslide", "com6rockavalanche", 'com8motpsa']:
cfgStart["VISUALISATION"]["scenario"] = str(count1)
cfgStart["INPUT"]["thFromIni"] = paramValuesD["thFromIni"]
cfgStart["VISUALISATION"]["sampleMethod"] = cfg['PROBRUN']['sampleMethod']
Contributor

The new line cfgStart["VISUALISATION"]["sampleMethod"] = cfg['PROBRUN']['sampleMethod'] reads sampleMethod from cfg['PROBRUN'].

However, probAnaCfg.ini has sampleMethod under [PROBRUN] only when probAna is the caller. If createCfgFiles is called from a different path where cfg doesn't have PROBRUN.sampleMethod, this will raise KeyError. The code also assumes VISUALISATION section exists in cfgStart for com8MoTPSA — while the new com8MoTPSACfg.ini does add it, there's no sampleMethod default there, making the flow dependent on the caller always providing this key.

Contributor Author

@RolandFischbacher RolandFischbacher Feb 28, 2026

Added a check whether 'VISUALISATION' exists in cfgStart; if not, it is added to cfgStart.
The sample method is now read with a fallback: if 'PROBANA' or 'sampleMethod' is missing, sample_method contains an empty string.
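A minimal sketch of this pattern with the standard-library `configparser`. The section name `PROBRUN` is taken from the diff line quoted by the reviewer above; `set_sample_method` is a hypothetical helper, not the function from the PR.

```python
import configparser


def set_sample_method(cfg_start, cfg):
    """Copy the sample method into cfg_start, tolerating missing sections and keys."""
    # make sure the target section exists before writing to it
    if not cfg_start.has_section("VISUALISATION"):
        cfg_start.add_section("VISUALISATION")

    # fallback covers both a missing section and a missing option
    sample_method = cfg.get("PROBRUN", "sampleMethod", fallback="")
    cfg_start["VISUALISATION"]["sampleMethod"] = sample_method
    return sample_method
```

With `fallback=""`, `ConfigParser.get` returns the empty string instead of raising `NoSectionError` or `NoOptionError`, so callers other than probAna do not need to provide the key.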

@OpenNHM OpenNHM deleted a comment from qltysh bot Feb 27, 2026
@OpenNHM OpenNHM deleted a comment from qltysh bot Feb 27, 2026
…Files in optimisationUtils.py), add possibility to RUN writeCfgFiles with counter
…clean header

Implement modName to make code general and remove adjustText package
…ror and stale values and remove copy paste error.
… added to cfgStart.

And read sample method with fallback; meaning if 'PROBANA' or 'sampleMethod' is missing sample_method contains an empty string
… docstring, tidy up code, add if __name__='__main__': in runPlotMorrisConvergence.py and improve BOConvergencePlot.
Contributor

@awirb awirb left a comment

first comments, more to come

# check for allConfigurationsInfo to find computation info and add to info fetched from ini files
if latest == False and isinstance(simDF, pd.DataFrame):
# check if in allConfigurationsInfo also info for existing sims
simDFALL, _ = readAllConfigurationInfo(avaDir, specDir="", configCsvName="allConfigurations")
Contributor

Would you also need to set the modName variable here, so that it doesn't just read the allConfigurations.csv from the Outputs of com1DFA?

Contributor Author

allConfigurations.csv is currently not available for com8MoTPSA module

@@ -0,0 +1,64 @@
### Config File - This file contains the main settings for the optimisation process

# Sidenote: (1) when running runOptimisation.py the working directory needs to be in the ana6Optimisation folder
Contributor

why is it actually located here and not in the runScripts directory?



[GENERAL]
# USER input for running and plotting a comparison of simulation result to reference polygon
Contributor

does this refer to the settings in runPlotAreaProfile?

Contributor Author

Yes this refers to the settings in runPlotAreaRefDiffs.py

[PARAM_BOUNDS]
# 2 scenarios: choose 1 or 2
scenario = 1
#(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only need to determine how much input paramters to use for optimisation
Contributor

Suggested change
#(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only need to determine how much input paramters to use for optimisation
#(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only needs to determine how much input parameters to use for optimisation

Contributor Author

Applied locally

# 2 scenarios: choose 1 or 2
scenario = 1
#(1): morris is run prior, then dataframe of ranked input parameters is already saved by runMorris.py as pickle file, user only need to determine how much input paramters to use for optimisation
topN = 3
Contributor

what is this parameter? add description

Contributor

and move description of second scenario up

Contributor

Also, this is only used in scenario 1, with the parameters from the Morris analysis, and there only the topN ranked ones, right? Now I see why scenario (2) is explained below.

Contains Cropshape and defines the maximal extent of runout area that is used for calculating areal indicators.

- **REFDATA**
Defines the runout area of the reference event.
Contributor

add info that this needs to have the suffix _POLY.shp (if polygon is required)

- **Digital Elevation Model (DEM)**
Must be placed directly in the `Inputs` directory and must cover the entire affected area.

More Details here: https://docs.avaframe.org/en/latest/moduleCom1DFA.html
Contributor

add #input

Contributor Author

@RolandFischbacher RolandFischbacher Mar 10, 2026

Do you mean the Inputs section of the provided link?


[GENERAL]
# USER input for running and plotting a comparison of simulation result to reference polygon
resType = ppr
Contributor

add layer

# Conflicts:
#	avaframe/ana6Optimisation/README_ana6.md
Contributor

@awirb awirb left a comment

comments for the optimisation part and plotting still missing

- thresholdValueSimulation
- modName
avalancheDir : str
Directory containing the directory of the reference avalanche
Contributor

do you mean just the path to the avalanche directory?

Contributor Author

@RolandFischbacher RolandFischbacher Mar 12, 2026

Yes, changed the description:)

Comment on lines +55 to +56
cfgAIMEC = cfgUtils.getModuleConfig(ana3AIMEC)
rasterTransfo, resAnalysisDF, plotDict, _, pathDict = ana3AIMEC.fullAimecAnalysis(avalancheDir, cfgAIMEC)
Contributor

instead of using the passed module, consider loading the config already in the runScript before you call the function and just pass the config, then also the override is easier (using ana3AIMEC_ana3AIMEC_override) or is there a special reason for passing the module?

Contributor Author

@RolandFischbacher RolandFischbacher Mar 12, 2026

Passing the module into the calcArealIndicatorsAndAIMEC function is not necessary, since the AIMEC settings are not overridden. I think not passing the module and loading the config here should be sufficient?

)
raise ValueError(message)

paramLossSubsetDF = paramLossDF.sort_values(by='Loss', ascending=True)[:N]
Contributor

Why is the [:N] needed? Isn't that from start to end if len(DF) is N?

Contributor Author

It defines how many of the best-ranked Morris samples are used for statistics (e.g. the parameter distribution within these topN samples). I changed the name to topN.
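To illustrate what the slice does, here is a minimal sketch in plain Python that mirrors the quoted `sort_values(by='Loss', ascending=True)[:N]` line on a list of dictionaries instead of a DataFrame. `best_samples` and the example keys other than `Loss` are hypothetical.

```python
def best_samples(rows, top_n):
    """Return the top_n rows with the lowest Loss, best first."""
    ranked = sorted(rows, key=lambda row: row["Loss"])
    # the slice only shortens the result when there are more than top_n rows
    return ranked[:top_n]


if __name__ == "__main__":
    samples = [
        {"simName": "a", "Loss": 0.42},
        {"simName": "b", "Loss": 0.10},
        {"simName": "c", "Loss": 0.31},
    ]
    print(best_samples(samples, 2))
```

If the number of rows equals `top_n`, the slice returns everything, which matches the reviewer's reading; it only matters when more samples are available than should enter the statistics.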


def createDFParameterLoss(df, paramSelected):
"""
Create DataFrames linking selected parameters with the loss function.
Contributor

does selected mean - the ones that were used for the parameter variation using the morris sampling method?

Contributor Author

What "selected" means depends on the scenario: if Morris has not been run beforehand, it means all parameters that were varied; if Morris has been run beforehand, it means only the topN most important parameters.

# beta gives penalty if sim. comes short compared to the ref. (FN)
tverskyAlpha = 2
tverskyBeta = 1
# Loss function is kombination of TverskyScore * weightTversky + RunoutNormalised * weight runout
Contributor

Suggested change
# Loss function is kombination of TverskyScore * weightTversky + RunoutNormalised * weight runout
# Loss function is a combination of TverskyScore * weightTversky + RunoutNormalised * weight runout

Contributor Author

Changed locally
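A minimal sketch of how a combined loss of this form could be computed. Several points are assumptions rather than facts from the PR: that the Tversky term enters as 1 minus the Tversky index (so lower is better), that alpha weights false positives (the config comment only states that beta penalises false negatives), that the runout term is already normalised to [0, 1], and that both weights default to 0.5. The function names are hypothetical; only `tverskyAlpha = 2` and `tverskyBeta = 1` come from the config shown above.

```python
def tversky_index(tp, fp, fn, alpha=2.0, beta=1.0):
    """Tversky index TP / (TP + alpha * FP + beta * FN), 0.0 for an empty overlap."""
    denominator = tp + alpha * fp + beta * fn
    return tp / denominator if denominator > 0 else 0.0


def combined_loss(tp, fp, fn, runout_normalised,
                  weight_tversky=0.5, weight_runout=0.5,
                  alpha=2.0, beta=1.0):
    """Weighted sum of a Tversky-based term and a normalised runout term."""
    # assumption: the score used in the loss is 1 - index, so a perfect match gives 0
    tversky_score = 1.0 - tversky_index(tp, fp, fn, alpha, beta)
    return weight_tversky * tversky_score + weight_runout * runout_normalised
```

For example, with 6 true-positive, 2 false-positive and 2 false-negative cells, the index is 6 / (6 + 2·2 + 1·2) = 0.5; with a normalised runout error of 0.2 and equal weights, the loss is 0.5·0.5 + 0.5·0.2 = 0.35.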

fU.makeADir(outDir)

# Get config from morris for path to morris results
cfgDir = 'runMorrisSA.ini'
Contributor

isn't the name runMorrisSACfg.ini?

Contributor

or the runMorrisSA.py

Contributor

it still works like this because it then uses the stem, so runMorrisSA to build the path to the config file, but I think it should be runMorrisSA.py
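The reason both spellings work can be shown with `pathlib`: the stem drops whatever extension is given. The sketch below assumes the config name is built as stem plus `Cfg.ini`, which matches the file names listed in the PR description (runMorrisSA.py with runMorrisSACfg.ini); `config_name_from` is a hypothetical helper, not the AvaFrame function.

```python
from pathlib import Path


def config_name_from(file_name):
    """Build the config file name from the stem of a script (or any) file name."""
    return Path(file_name).stem + "Cfg.ini"


if __name__ == "__main__":
    # both inputs share the stem 'runMorrisSA', so they map to the same config
    print(config_name_from("runMorrisSA.ini"))
    print(config_name_from("runMorrisSA.py"))
```

Using `runMorrisSA.py` is still the clearer choice, since it names a file that actually exists.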

- The top-N most influential parameters are selected for optimisation.

Scenario 2 (Manual definition):
- No prior Morris screening.
Contributor

If I understood correctly, Morris analysis could have been performed previously to decide which parameters should be considered in the optimisation and which ones do not have a strong effect on the loss function and are therefore not considered? So scenario 2 just means that first simulations have to be performed to start the optimisation with or used from the ana4Prob run?

Contributor

so in contrast in scenario 1 the simulations performed for the morris analysis using the morris sampling are used directly?

Contributor Author

Yes, the statement is correct.

def loadVariationData(cfgOpt, outDir, avaDir):
"""
Load parameter bounds and selected parameters for optimisation. Two execution modes are supported, controlled via
cfgOpt['PARAM_BOUNDS']['scenario']:
Contributor

in the description of the scenarios below, both say that the parameter bounds are either read from sa_parameter_bounds.pkl (scenario 1) or from paramValuesD.pickle created in the runAna4ProbAna (scenario 2) - so cfgOpt['PARAM_BOUNDS'] is not used? consider mentioning already here that this relies on previous simulation runs performed using Morris analysis or ana4ProbAna

Contributor Author

cfgOpt['PARAM_BOUNDS'] is used to determine which file is read, either sa_parameter_bounds.pkl or paramValuesD.pickle
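A minimal sketch of this dispatch, assuming the scenario is read as an integer from the `[PARAM_BOUNDS]` section. The two file names come from the discussion above; `bounds_file_for` is a hypothetical helper, and where the files are located on disk is deliberately left out.

```python
import configparser

# file names as mentioned in the review discussion
BOUNDS_FILES = {
    1: "sa_parameter_bounds.pkl",  # scenario 1: written by the Morris analysis
    2: "paramValuesD.pickle",      # scenario 2: written by the ana4Prob run
}


def bounds_file_for(cfg_opt):
    """Return the bounds file name that matches the configured scenario."""
    scenario = cfg_opt.getint("PARAM_BOUNDS", "scenario")
    if scenario not in BOUNDS_FILES:
        raise ValueError(f"scenario must be 1 or 2, got {scenario}")
    return BOUNDS_FILES[scenario]
```

Failing loudly on any other value makes it clear that the optimisation relies on results from an earlier Morris or ana4Prob run.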

paramBounds, paramSelected = optimisationUtils.loadVariationData(cfgOpt, inDir, avalancheDir)

# Calculate Areal indicators and AIMEC and save the results in Outputs/ana3AIMEC and Outputs/out1Peak
optimisationUtils.calcArealIndicatorsAndAimec(cfgOpt, avalancheDir, ana3AIMEC)
Contributor

so for the areal indicators, the settings are read from cfgOpt for aimec from the aimecCfg - here we could use the override config functionality

plt.close(fig)


def saveBestorCurrentModelrun(finalDF, paramSelected, ei=None, lcb=None, simName=None, csv_path='dummy.csv'):
Contributor

saveBestOrSpecificSimulation ?


3 participants